Path Enumeration

Explore the first solution of the Naive Trees Antipattern.

We'll cover the following

Path Enumeration in directory hierarchies
Creating Comments table with a path column
Querying the Comments table
Path Enumeration oddity

One weakness of an Adjacency List is that it’s expensive to retrieve the ancestors of a given node in the tree. In Path Enumeration, this is solved by storing the string of ancestors as an attribute of each node.

Path Enumeration in directory hierarchies#

We can see a form of Path Enumeration in directory hierarchies. A UNIX path like /usr/local/lib/ is a Path Enumeration of the filesystem, where usr is the parent of local, which in turn is the parent of lib.

Creating `Comments` table with a `path` column#

To employ the Path Enumeration solution, we can define a column called path as a long VARCHAR, instead of the parent_id column, in the Comments table. The string stored in this column is the sequence of ancestors of the current row, from the top of the tree down, just like a UNIX path. We can even choose / as a separator character.

Enter to Rename, Shift+Enter to Preview

Creating the Comments table

Querying the `Comments` table#

This section will experiment on the table with some queries to find out the ancestors and the descendants.

Finding out ancestors of a comment#

We can query ancestors by comparing the current row’s path to a pattern formed from the path of another row. For example, we will use the following query to find the ancestors of comment 7, whose path is 1/4/6/7/.

Enter to Rename, Shift+Enter to Preview

Finding the ancestors of comment 7

This matches the patterns formed from paths of ancestors 1/4/6/%, 1/4/%, and 1/%.

Finding out the descendants of a comment#

We can query descendants by reversing the arguments of the LIKE predicate. To find the descendants of comment 4 whose path is 1/4/, we can use the following query:

Enter to Rename, Shift+Enter to Preview

Retrieving the descendants of comment 4

The pattern 1/4/% matches the paths of descendants 1/4/5/, and 1/4/6/, and 1/4/6/7/.

Counting the number of comments#

Once we can select a subset of the tree or the chain of ancestors to the top of the tree, we can perform many other queries easily, such as computing the SUM() of costs of nodes in a subtree or simply counting the number of nodes. For example, to count the comments per author in the subtree starting at comment 4, we can do this:

Enter to Rename, Shift+Enter to Preview

Counting comments per author in a subtree

Test it yourself#

Try to run the queries in the following playground:

/

main.sql

main.sql

Enter to Rename, Shift+Enter to Preview

Test it yourself editor

You can copy and paste the queries in the playground to see the effect on the database.

How manipulation in a tree affects the model#

The process of inserting a node is similar to that of inserting in the Adjacency List model. We can insert a non-leaf node without needing to modify any other row.

To do this, let’s copy the path from the new node’s parent, and append the ID of the new node to this string. If our primary key generates its value automatically during the insert, we may need to insert the row and then update the path once we know the ID value for the new row. For example, if we use MySQL, the built-in function LAST_INSERT_ID() returns the most recent ID value generated for an inserted row in the current session. We can get the rest of the path from the parent of our new node.

Enter to Rename, Shift+Enter to Preview

Inserting another record in the Comments table

Let’s retrieve the data in the Comments table after inserting a record.

Enter to Rename, Shift+Enter to Preview

Retrieving data after inserting a record in the Comments table

Path Enumeration oddity#

Path Enumeration has some drawbacks similar to those shown in the chapter Jaywalking. The database can’t enforce that the path is formed correctly or that values in the path correspond to existing nodes. Maintaining the path string depends on the application code, and verifying it is costly.

Moreover, no matter how long we make the VARCHAR column, it still has a length limit, so it doesn’t strictly support trees of unlimited depth.

Path Enumeration allows us to sort a set of rows easily by their hierarchy, as long as the elements between the separator are of a consistent length.

Solution: Use Alternative Tree Models

Nested Sets

Mark as Completed

Report an Issue

Path Enumeration

Path Enumeration in directory hierarchies#

Creating Comments table with a path column#

Querying the Comments table#

Finding out ancestors of a comment#

Finding out the descendants of a comment#

Counting the number of comments#

Test it yourself#

/

How manipulation in a tree affects the model#

Path Enumeration oddity#

Creating `Comments` table with a `path` column#

Querying the `Comments` table#